Read this chapter to satisfy your curiosity about how people locate, retrieve, and use information that's available on the Internet. The answers include these:
I should point out from the beginning that any of the communications techniques described in Chapter 6 can also serve as research tools. People learn a lot from mailing lists and newsgroups, and when they don't see the answers, they can post their
own questions and receive a dozen or more answers within hours.
When a user can scare up the Internet address of an expert (something that's getting easier as more folks list their Internet addresses in books and at the ends of articles in magazines and journals), he or she can e-mail a polite question. If the
expert is not an especially busy expert, he or she may respond. Those lucky enough to establish an e-mail rapport with knowledgeable people may even dare to take the relationship a step further, initiating a Talk session to make those discoveries that come
only through conversation.
So although the techniques in this chapter overlap those in Chapter 6, here you'll mostly see how users extract information straight from the other computers on the Internet, usually without having to enlist the assistance of another person.
Newsgroups and mailing lists (described in Chapter 6) are two Internet resources that make finding and supplying information easy because they all work basically the same way; that is, if a user knows how to find information and post information in
one newsgroup, he or she knows how to use them all. Ditto mailing lists, for the most part. Newsgroups achieve this consistency because they all conform to the Usenet format. Mailing lists are consistent because they all rely on the same e-mail-based
approach.
Unfortunately, many other Internet resources are not consistent in the way they look and in how they're used. Out on the Internet are thousands of computer systems that are managed not for the benefit of the whole Internet community but for the use of
a smaller group (a university faculty, a research staff, a federal agency). Each system has its own way of doing things: a different way of organizing information on the screen, a different set of commands, a different set of rules for who can do what.
These systems are set up in whatever way best suits their primary users, and Internet visitors are expected to learn, when in Rome, to speak the local language.
To access most of these computer systems, Internet folk use a facility called Telnet. Telnet enables an Internet user to drop in on another computer system and use it as if he or she were one of the computer's primary users. (The procedure is
sometimes also described as remote login.) Within some limitations that you'll learn about later, Internet users can look through a college's library catalog as if they were students at that college. Likewise, they can consult the resources of a think tank
as if they were one of the researchers, and poke around in the computer systems of government agencies, public libraries, and businesses.
But here's the catch: When an Internet user telnets into Harvard's campus information system, that user sees on his or her screen exactly what students and faculty at Harvard see: the same lists of options, the same information. If a user telnets
to Dartmouth's system, that user sees what the students and faculty at Dartmouth see. The problem is that the systems of the two universities aren't used in exactly the same way (see Figure 7.1). Before the Internet user can use any given resource through
Telnet productively, he or she may have to feel around a little to figure out how things are done there.
Learning to use e-mail, newsgroups, and mailing lists is a snap, as you may have discovered while reading earlier chapters. More than anything else, it is the inconsistency among the many Telnet systems that leads people to see the Internet as a
complicated system usable only by computer virtuosos.
Fortunately, any computer user with a little experience can quickly come up to speed on most Telnet systems. As Figure 7.1 suggests, nearly all Telnet systems can be operated through menus.
Figures 7.1a and 7.1b. Two different university information systems on the Internet, reached through Telnet.
Telnet resources vary not just in their menuing systems, but in other ways. In particular, they may require a procedure called "logging on" or "signing on": supplying a username and password in order to use the system. The
username/password routine exists to enable the computer to keep track of its users and to control access to sensitive information or features.
For example, a professor at a university may be allowed to look at grade reports, but students may not be. In a business, the chief accountant and other executive officers are typically allowed to see financial data that most employees are forbidden to
see. Because each user of a computer system signs on with a unique username, the computer knows who's who and can restrict a user from seeing what that person is not supposed to see. The secret passwords ensure that people can't use the system dishonestly
by signing on with someone else's username.
Not being the primary users of most computers accessed through Telnet, Internet users have no username or password for each individual computer system. So the Telnet computer systems have two ways of letting Internet users get on board:
In either case, the computer systems typically give Internet users the most restrictive security setting, allowing them access only to the most public information available on the system and keeping them out of private information and services that may
share the same computer.
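As a rough illustration of how a system might use the sign-on name to decide what each person may see, here is a minimal sketch in Python. The usernames, access levels, and resource names are all invented for this example; it does not describe any real Telnet system.

```python
# Hypothetical access levels: a higher number means more privileges.
GUEST, STUDENT, PROFESSOR = 0, 1, 2

# Hypothetical accounts and the level each one signs on with.
ACCOUNTS = {"guest": GUEST, "jsmith": STUDENT, "profjones": PROFESSOR}

# Minimum level required to view each (invented) resource.
RESOURCES = {
    "library catalog": GUEST,
    "course schedule": STUDENT,
    "grade reports": PROFESSOR,
}


def can_view(username, resource):
    """Return True if the signed-on user may see the resource."""
    # Unknown usernames get the most restrictive setting.
    level = ACCOUNTS.get(username, GUEST)
    return level >= RESOURCES[resource]
```

In this sketch, `can_view("guest", "grade reports")` is false while `can_view("profjones", "grade reports")` is true, which mirrors the professor-and-student example above: an Internet visitor is treated like the guest account and sees only the most public material.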
Users of Telnet are vexed in yet one more way: terminal types. Again, because the various computer systems were designed primarily for their local users and not for everybody on the Internet, their methods for displaying information on-screen may be
specifically tailored to a particular type of computer terminal or workstation. On other types of terminals, the displayed words may come out garbled.
Internet users get around this problem with terminal emulation software that teaches their computers to mimic the terminal the Telnet system wants. Even so, this is one more inconsistency that makes using the Internet trickier than using a commercial
online service such as CompuServe or America Online, where everything looks and acts the same.
Many Telnet sites try to help users with this problem. And why shouldn't they? They've gone to the trouble of making themselves available to Internet users; they might as well smooth the path a little. The help available comes in several forms:
Figure 7.2. A help item (the top item in the left-hand column) on a menu in FedWorld, a database of Federal government information. To get help, the user presses B.
Most systems accessible through Telnet are overseen by one or more system operators, or sysops. Many systems display the name and e-mail address of the sysop when the user signs on, or they offer menu items that lead the user to a facility for
communicating with the sysop. Sysops are busy people, and they're paid to assist the primary users, so they usually don't like answering zillions of questions from Internet people, especially when the answers are available through help items or other
methods that don't involve the sysop. When all else fails, however, Internet users can approach the sysop and almost always get the answers they need.
Systems available through Telnet include:
The Agriculture and Nutrition database at the University of Pennsylvania.
The Earthquake information system at Washington University.
CARL, a database of Colorado library catalogs, book reviews, articles, and more (see Chapter 2).
A thought-for-the-day service from Temple University.
Chess games and the oriental strategy game, Go (see Chapter 8).
NASA's National Space Science Data Center.
A history database at the University of Kansas.
The Concise Oxford Dictionary at Rutgers University.
A database of information about alcoholism and substance abuse at Dartmouth.
The Library of Congress catalog, through the University of Minnesota.
A bulletin board at the American Philosophical Association.
Many university information systems and their libraries.
Computer files are available all over the Internet, and they contain everything that can be stored in a computer file:
There are four basic ways Internet users copy files from other computers on the Internet to their own:
This last option, FTP, offers the greatest breadth of information. A large subset of computers on the Internet are known as FTP sites. Many of these are also Telnet sites. What the user sees when accessing these sites, however, depends on how
he or she goes in: If the user goes in through Telnet, he or she sees menus for using the information services available there. If the user goes in with FTP, he or she sees lists of files available for copying. As with Telnet, users are usually required to
sign on to the computer with a username and password. The exception is "anonymous" FTP sites; these are set up to allow anybody to copy files without identification. (Actually, the process isn't completely anonymous. The computer at the FTP site
still knows who the user is, thanks to behind-the-scenes communications between the user's computer and the FTP site. The advantage of the anonymous FTP site is that users don't have to remember passwords to get files.)
Typically, the lists of files at an FTP site are organized in groups called directories (see Figure 7.3). Directories can contain lists of other directories, each of which can contain a list of still more directories. So it can take some effort to plow
through the lists and directories to locate a particular file.
Figure 7.3. A list of files available for copying, as seen by an Internet user through FTP.
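To picture why plowing through directories takes effort, the nesting described above can be sketched in Python. The site layout and the `find_file` helper are invented for illustration; a real FTP site would have to be explored one directory listing at a time over the network.

```python
# A made-up FTP site: directories are dicts, files are plain strings.
SITE = {
    "pub": {
        "docs": {"faq.txt": "file", "rfc": {"rfc959.txt": "file"}},
        "games": {"chess.zip": "file"},
    },
    "README": "file",
}


def find_file(tree, name, path=""):
    """Return the full path of the first file called `name`, or None."""
    for entry, content in tree.items():
        full_path = path + "/" + entry
        if isinstance(content, dict):
            # A directory: plow through its contents in turn.
            found = find_file(content, name, full_path)
            if found is not None:
                return found
        elif entry == name:
            return full_path
    return None
```

Here the file `rfc959.txt` sits three directories deep, so the search has to open `pub`, then `docs`, then `rfc` before it finds anything.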
Sometimes the names of the directories and files will help a user find what he or she is looking for, but more often, they're cryptic. Many FTP users don't go fishing; they use FTP only when they know exactly what they're looking for and where to find it.
There is a facility that makes finding and copying files through FTP easier. Affectionately named Archie, the facility finds files available through FTP that match a name (or part of a name) supplied by the user.
When a user has at least a vague idea of what the desired file is called, he or she accesses a computer called an Archie server. The user's access provider may have an Archie server; if not, any user can Telnet to any of several Archie servers. The
user then types all or part of the filename and puts the Archie server to work. (The user can, in fact, tailor the search in several ways to make the search more accurate.) Archie searches a group of lists containing the names of files at various FTP
sites, and if successful, it shows the user where to get the desired file. The user can then use FTP to access the FTP site and copy the file.
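The idea of searching lists of filenames for a partial match can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the index, the site names, and the `archie_search` function are hypothetical, and a real Archie server offers more search options than this simple substring test.

```python
# Invented index: each FTP site maps to the file paths it offers.
INDEX = {
    "ftp.example.edu": ["/pub/games/chess.zip", "/pub/docs/faq.txt"],
    "archive.example.org": ["/mirrors/chess-openings.txt", "/mirrors/readme"],
}


def archie_search(index, fragment):
    """Return (site, path) pairs whose filename contains the fragment."""
    fragment = fragment.lower()
    matches = []
    for site, paths in index.items():
        for path in paths:
            # Compare against the filename only, not the directory names.
            filename = path.rsplit("/", 1)[-1]
            if fragment in filename.lower():
                matches.append((site, path))
    return sorted(matches)
```

Searching this index for `chess` would report one matching file at each of the two sites, which tells the user where to point FTP next.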
Menus make using an Internet resource easier, once the user makes his or her way to the resource. The challenges of finding a resource, getting onto it, and figuring out the peculiarities of its menu and command system conspire against all but the
bravest Internet users. But what if there were a system that allowed a user to find a resource, get to it, use it, and even copy files, all from a fairly consistent, easy-to-use set of menus? That's the idea behind Gopher, which along with Mosaic has made
the Internet much easier to use and in doing so has opened up the Internet to a new group of users.
Named for the mascot of the University of Minnesota, where it was developed, Gopher is a system of menus that allows users to "browse" for information simply by moving through the menus (see Figure 7.4). The beauty is that Gopher's activities are
spread across a wide range of Internet sites and resources, but it insulates the user from the tasks of choosing an Internet site, using Telnet or FTP to get to it, signing on, and more. All that stuff is done behind the scenes, or eliminated
altogether.
Instead, the user picks through menus of subjects, sites, regions, or other ways of breaking down the possibilities; in effect, the user "burrows" for information (hence the other explanation for Gopher's name). Even upon arriving at a
specific resource, the user sees the same style of menus, which work the same way. All the resources that participate in Gopherspace (the catch-all term describing the sum of all the resources accessible through Gopher menus) have agreed to play by the
same rules to keep things simple.
Figure 7.4. A Gopher menu, seen through a Gopher software tool.
While moving around the menus within Gopherspace, a user may jump from one Internet resource to another without even knowing it. Gopher makes the collection of resources that support it (which today include a strong collection of universities and other
sites, but by no means all of the Internet) seem like part of one big, consistent, smooth service where everything works the same way.
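One reason a user can hop between computers without noticing is that every Gopher menu item carries the address of the computer that holds it. The sketch below assumes the tab-separated menu line format from the Gopher protocol specification (RFC 1436): a one-character type, the title, a selector string, a host, and a port. The host names and menu entries are invented, and the `MenuItem` and `parse_menu_line` names are hypothetical.

```python
from typing import NamedTuple


class MenuItem(NamedTuple):
    item_type: str   # "0" = text file, "1" = another menu
    title: str       # what the user sees on the menu
    selector: str    # string sent to the server to request the item
    host: str        # computer that actually holds the item
    port: int


def parse_menu_line(line):
    """Split one tab-separated Gopher menu line into its fields."""
    display, selector, host, port = line.rstrip("\r\n").split("\t")[:4]
    return MenuItem(display[0], display[1:], selector, host, int(port))


# Example lines with invented hosts; the two items live on different computers.
MENU = [
    "1Library catalogs\t/libraries\tgopher.example.edu\t70",
    "0Weather report\t/weather/today\tgopher.example.org\t70",
]
ITEMS = [parse_menu_line(line) for line in MENU]
```

The user sees only the two titles side by side on one menu; the software quietly uses the host and port behind each title to fetch the item from wherever it lives.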
Gopher is available to all Internet users, but it really shines for those who have software tools designed to take special advantage of it, tools like the one shown in Figure 7.4. A PC program that uses the Microsoft Windows environment, WinGopher
allows the user to burrow through Gopherspace by clicking the mouse on items of interest.
At the end of a browsing session, a user may arrive at a menu of computer files. If he or she wants to copy one, the user simply picks it from the menu; just as Gopher shields the user from the details of Telnet, so too it hides the FTP activity
required to copy the file.
Browsing is a great way to find stuff, but if you've ever tried to find one book in a big library by browsing alone, you know that it's not an efficient approach.
To solve this problem, Gopher has a companion tool, Veronica. Just as Archie helps users find files, Veronica enables the user to type a "search term" to find Gopher menus and items on a particular subject. There are several ways to search
with Veronica, but typically the user types one word or more for the search term. Veronica searches Gopherspace, finds all the matching menu items, and creates and displays a new menu listing the items that match. The user can then choose any item from
that menu.
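The way a search term turns into a fresh menu can be sketched as follows. Everything here is illustrative: the list of menu items is invented, `veronica_search` is a hypothetical name, and treating several words as "all must match" is an assumption of this sketch rather than a description of how Veronica itself combines words.

```python
# Invented menu items gathered from several hypothetical Gopher sites.
GOPHERSPACE = [
    ("Earthquake reports", "gopher.example.edu"),
    ("Nutrition database", "gopher.example.org"),
    ("Earthquake safety tips", "gopher.example.net"),
]


def veronica_search(items, search_term):
    """Build a new menu of the items whose titles contain every search word."""
    words = search_term.lower().split()
    return [
        (title, host)
        for title, host in items
        if all(word in title.lower() for word in words)
    ]
```

A search for `earthquake` would produce a two-item menu drawn from two different sites, and adding the word `safety` would narrow it to one.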
Gopherspace is made up of Internet computers that are set up to play by Gopher's rules so that users can access resources easily. Gopherspace includes only a small subset of the whole Internet, but when the superset is as large as the Internet, a
small subset can be pretty big. There are hundreds of Gopher sites offering millions of documents, files, and other resources, and more sites and resources join Gopherspace regularly.
Another subset of the Internet is called the World Wide Web (also known as the WWW, or the Web). The sites on the WWW have also agreed to arrange their resources on menus and to make information "browseable" by subject or by site; but the
folks on the WWW take browsing a step further.
When a user looks at menus, documents, or other screenfuls of information on the WWW, the user sees various words highlighted in some way so that they stand out from the rest. The actual way the words are highlighted varies a little by resource and
depends on what type of software is used to access the WWW: a number may appear next to the word or the word may be displayed in bold or in a different color. However they're shown, the highlighted words are called keywords (see Figure 7.5). Keywords
are doorways to information.
Figure 7.5. A WWW menu, with highlighted keywords that lead to related information.
A user chooses a keyword in much the same way that he or she might choose a menu item. When the user does so, a new screen appears with related information and perhaps still more keywords to choose from. Users find out what they need to know by reading
what's on each screen, then drilling down to more specific, related information through the keywords. Users can also easily "back out" one step at a time through the screens they've read, to choose a different keyword and start down a different
path. It's much like using menus, but far more powerful and flexible because it allows users to jump spontaneously from idea to idea. Every document can double as a menu to more documents.
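Following keywords and backing out one step at a time can be modeled with a simple history list. The sketch below is a toy under stated assumptions: the pages, the keywords, and the `Browser` class are invented for illustration and do not describe how any particular WWW browser is built.

```python
# Invented documents: each one lists its keywords and where they lead.
PAGES = {
    "home": {"medicine": "medicine", "space": "space"},
    "medicine": {"heart": "heart"},
    "heart": {},
    "space": {},
}


class Browser:
    def __init__(self, start):
        self.current = start
        self.history = []  # screens already read, oldest first

    def follow(self, keyword):
        """Jump to the document behind a keyword on the current screen."""
        target = PAGES[self.current][keyword]
        self.history.append(self.current)
        self.current = target

    def back(self):
        """Back out one step, if there is anywhere to go back to."""
        if self.history:
            self.current = self.history.pop()
```

A reader who follows `medicine` and then `heart` can call `back()` twice to return to the starting screen and then set off down the `space` path instead.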
Originally called hypertext, the keyword approach has recently been renamed hypermedia because the WWW is evolving into a source of multimedia information, including text, pictures, sound, and even video.
To take advantage of hypermedia, the user needs an Internet software tool that can deal with multimedia information. That's where WWW browsers come in. Several are available, including a program called Cello from Cornell University and Lynx from the
University of Kansas, which is currently the most-used WWW browser. A newer freeware program called Mosaic, however, has been getting the lion's share of the attention lately. Perhaps second only to Gopher, Mosaic has encouraged the press and everybody
else to rethink the Internet and consider its potential as a resource within the reach of novices. Mosaic isn't perfect; it has technical limitations that restrict it to users who have a particular type of Internet connection. (Users with the wrong
connection for Mosaic can use other WWW browsers.) But it's catching on like wildfire among those who can use it.
Created by the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, Mosaic provides a graphical window to the WWW through which users can "point and click" with a mouse to choose keywords. Because
it can display graphics on the screen (it's available for several graphical computing environments, including PCs with Windows and the Macintosh), Mosaic can show on the screen little pictures (icons) that serve as keywords for accessing multimedia
information.
When a user selects an icon that leads to a multimedia item, Mosaic determines which type of software is necessary to play that item, then starts up the required program. For example, if the user selects an icon that leads to a sound clip, Mosaic
starts another program that is capable of playing sound clips. When the user has finished listening to the clip, Mosaic exits the sound program and returns the user to the WWW screen. If an icon leads to a picture, Mosaic starts a viewer program to display
it.
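The step of matching a multimedia item to the program that can play it might look something like the sketch below. It is a simplification under stated assumptions: the helper program names are invented, and deciding by file extension alone is a choice made for this example, not a description of how Mosaic makes the decision.

```python
import os

# Invented mapping from file extension to the helper program to launch.
HELPERS = {
    ".au": "sound-player",
    ".wav": "sound-player",
    ".gif": "picture-viewer",
    ".jpg": "picture-viewer",
    ".mpg": "movie-player",
}


def choose_helper(filename):
    """Return the helper program for a file, or None if there is no match."""
    extension = os.path.splitext(filename)[1].lower()
    return HELPERS.get(extension)
```

In this sketch a sound clip named `heartbeat.au` is handed to the sound player and a picture named `xray.gif` to the picture viewer, while an unrecognized file type gets no helper at all.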
Figures 7.6 and 7.7 illustrate the power of hypermedia through Mosaic. The resource shown is a multimedia medical textbook created by the University of Iowa. The user opens the textbook, reads each screen, and uses keywords to jump to related
information at will. When an illustration is available to help the reader understand a concept, an icon appears on the screen, as shown in Figure 7.6.
When the user moves the mouse pointer to an icon and clicks the mouse button, Mosaic starts a viewer to show the full illustration, as shown in Figure 7.7. (Notice that the icon in Figure 7.6 is a miniature version of the full illustration shown in
Figure 7.7.) When the user finishes looking at the illustration, Mosaic closes the viewer and returns to the screen shown in Figure 7.6.
Figure 7.6. A Mosaic screen featuring hypermedia icons.
Figure 7.7. The picture displayed after the user selects an icon, shown through a viewing program automatically started by Mosaic.
Internet users rely on a range of tools and techniques for digging up information; some are easy, some aren't. Among the research tools and facilities making the Internet both easier and more powerful are
Other Internet research facilities, such as Telnet and FTP, can require more skill and patience. Most folks, however, find the effort rewarding, given the great depth and breadth of resources the Internet provides.